Structure of In Vitro-Synthesized Cellulose Fibrils Viewed by Cryo-Electron Tomography and 13C Natural-Abundance Dynamic Nuclear Polarization Solid-State NMR

Cellulose, the most abundant biopolymer, is a central source for renewable energy and functionalized materials. In vitro synthesis of cellulose microfibrils (CMFs) has become possible using purified cellulose synthase (CESA) isoforms from Physcomitrium patens and hybrid aspen. The exact nature of these in vitro fibrils remains unknown. Here, we characterize in vitro-synthesized fibers made by CESAs present in membrane fractions of P. patens over-expressing CESA5 by cryo-electron tomography and dynamic nuclear polarization (DNP) solid-state NMR. DNP enabled measuring two-dimensional 13C–13C correlation spectra without isotope-labeling of the fibers. Results show structural similarity between in vitro fibrils and native CMF in plant cell walls. Intensity quantifications agree with the 18-chain structural model for plant CMF and indicate limited fibrillar bundling. The in vitro system thus reveals insights into cell wall synthesis and may contribute to novel cellulosic materials. The integrated DNP and cryo-electron tomography methods are also applicable to structural studies of other carbohydrate-based biomaterials.


■ INTRODUCTION
Cellulose is the most abundant biopolymer on Earth. It comprises the majority of plant biomass and serves as a major reservoir of renewable energy and functional biomaterials. 1−4 In the primary and secondary cell walls of plants, cellulose exists in the form of crystalline microfibrils, providing support and rigidity to the cells. 5 Chemically, each elementary cellulose microfibril (CMF) (3−4 nm across) is presumably assembled from 18 chains of β-1,4-glucans held together by numerous hydrogen bonds. 6,7 Elementary microfibrils further associate to form large bundles that are 10−20 nm across, particularly in secondary plant cell walls. 8 In the plant cell, each individual glucan chain is produced from uridine diphosphate-glucose (UDP-α-D-Glc) substrate by cellulose synthase (CESA) proteins located at the plasma membrane. 9,10 CESA units are themselves arranged in a larger hexagonal structure called the CESA complex (CSC). For decades, the exact number of glucan chains in a microfibril has remained elusive. Initially, a 36-chain model was proposed based on the assumption that each CSC might have a hexamer-of-hexamers organization. 11 However, diffraction and spectroscopic data supported smaller models with either 24 or 18 chains in each microfibril. 12−14 Most recent studies suggest that each of the six lobes of a CSC typically contains three CESAs. 6,15−17 Consequently, each CSC could polymerize up to 18 glucan chains at once, 18,19 making the 18-chain arrangement the best-accepted model. Thereafter, spatial proximity between the newly synthesized chains allows the formation of CMFs due to electrostatic interactions (hydrogen bonding and van der Waals forces) relayed by hydroxyl groups, 20 followed by a bundling process into larger fibrils. 21 To rationally engineer plants and tailor cellulose production to fulfill current needs for energy and material, an in-depth understanding of cellulose biosynthesis and assembly is needed.
Previously, in vitro cellulose synthesis was reported using plant membrane fractions of blackberry, mung bean, hybrid aspen, and tobacco. 22−25 We have also reproduced cellulose biosynthesis in vitro, starting from a UDP-glucose medium 26 and solubilized protein from microsomes of Physcomitrium patens overexpressing CESA5, or from CESA5 or poplar CESA8 expressed in Pichia, purified, and reconstituted into proteoliposomes. 27,28 Because linkage analysis confirmed the synthesis of mostly β-1,4-glucan chains, two questions have arisen. How are these in vitro-synthesized cellulose fibers assembled? Can the assembled fibers fully replicate the structure of microfibrils present in native cell walls? To begin answering these questions, we combined cryo-electron tomography (CET) and solid-state NMR to characterize the structure of in vitro fibers on the nanoscale and atomic levels, respectively. Achieving the solid-state NMR results required a 10-fold scale-up of the previously reported reaction protocol 26 and the use of magic-angle spinning (MAS) dynamic nuclear polarization (DNP) to enhance the NMR signal.
Recently, multidimensional solid-state NMR techniques have shown their capability of revealing the molecular structure of cellulose and its interactions with other biopolymers (such as hemicellulose, pectin, and lignin) in native plant cell walls and carbohydrate-based materials. 29−33 By coupling 13C labeling of samples and high-field NMR, seven types of glucose units were consistently identified in the CMFs across the cell walls of a variety of plant genera, including Arabidopsis, Brachypodium, maize, rice, switchgrass, poplar, eucalyptus, and spruce. 34−36 None of these glucose units follow the 13C chemical shifts of the bulk allomorphs, Iα and Iβ structures, 37 revealing a substantially deviated structure of cellulose when placed in the native context. However, the expected signals of Iα and Iβ allomorphs have been recently observed in cotton, indicating that model crystal structures are only possible in highly crystalline cellulose with large crystallites. 38 To apply NMR to reveal the structure of in vitro CMF, the methodology employed in previous plant studies must overcome two major challenges: the limited amount of biomaterial that can be obtained in vitro and the difficulty in 13C-labeling these fibers. Here, we employ MAS-DNP to vastly enhance the NMR sensitivity and eliminate the need for isotope enrichment. High-resolution data provide both qualitative and quantitative information about in vitro-synthesized CMFs. 39−43 Subtomogram averaging of particles obtained via CET of in vitro-synthesized fibers showed them to contain two interwoven fibers, each about the size of an 18-chain CMF and of similar dimensions to the CET-based structure for CMF in cell walls of Arabidopsis 44,45 and onion. 46 For Arabidopsis, using Amira software to model fibers and measure distances, CMFs with three types of cross-sectional areas were observed. 44,45 One type with a 3.5 nm diameter was circular, those of 5.0−5.5 nm were slightly oval extensions of the smaller circular shape, and those of 9−10 nm were oval with dimensions consistent with two adjacent smaller CMFs. Removing matrix materials reduced the larger ovals to 7 nm in diameter. In the onion study, rather than using a simple ruler, a full-width-at-half-maximum (FWHM) approach was used to account for edge distortions caused by birefringence due to imaging cell walls at defocus. 46 The width of onion CMFs determined using this approach ranged between 5.3 and 6.3 nm. Consistently, the FWHM diameter values for in vitro-synthesized fibers that we describe below vary from 4.5 to 6.5 nm depending upon where along the fiber the size is measured.
Two-dimensional (2D) 13C−13C correlation spectra enabled by MAS-DNP showed that the in vitro CMF largely retained the structural features of those microfibrils in intact Arabidopsis cell walls. Spectral deconvolution and intensity integration of CMF spectra allow comparing the in vitro fibers to previously proposed microfibril structures, with good agreement with the 18-chain arrangement in the microfibril cross-section. 47 The extensive cross-peaks of a spin-diffusion-based 2D spectrum also allow us to detail the conformers constituting CMF. These results not only shed light on the structure of elementary CMFs but also present a novel strategy for analyzing the high-resolution structures of unlabeled biomaterials.
Biomacromolecules pubs.acs.org/Biomac Article

■ EXPERIMENTAL SECTION
In Vitro CMF Sample Preparation. CMFs were produced in vitro following a method previously described, 26 using the moss P. patens 48 overexpressing HA-tagged P. patens CESA5. In Figure 1a,b, the growth stages of moss are schematically represented: gametophores grow from the protonema, which contains two types of elongated cells: chloronema for photosynthesis and caulonema for substrate colonization and nutrient acquisition. 48 To provide sufficient in vitro-synthesized cellulose, 10 membrane preparations were combined and incubated for 24 h at room temperature in the reaction buffer containing 20 mM cellobiose (Figure 1c). For the negative control, 20 μL of the microsomal protein fraction was incubated in buffer lacking UDP-glucose and cellobiose. After incubation, the presence or absence of microfibrils was assessed by placing 3.5 μL from each incubation on carbon-coated copper grids, which were negatively stained with 0.75% uranyl formate and imaged using an FEI Tecnai 12 Spirit Biotwin transmission electron microscope [FEI; 120 kV; 6.3 mm spherical aberration (Cs); 4k × 4k Eagle CCD camera]. The negative control showed no fibers, whereas fibers were abundant in the experimental sample (Figure S1). To concentrate the microfibrils, 18 mL of the in vitro product was centrifuged at 50,000 rpm for 20 min in an Optima Max ultracentrifuge (Beckman Coulter, USA) using a TLA-100.3 rotor, and the supernatant was discarded. The pellet was resuspended in 10 μL of 100 mM MOPS buffer (pH 6.8). The wet weight of the fibers was about 17 mg.
Subtomogram Averaging. The synthesized CMFs were vitrified by plunge-freezing into liquid ethane using a Vitrobot (FEI), followed by data collection in a Titan Krios system (FEI; 300 kV) using a K3 detector (Gatan) (Figure 1d). Tilt movies of ten frames (5760 × 4096 pixels per frame) were collected dose-symmetrically from 0 to 60° and −60° in 3° increments and processed for motion and contrast transfer function correction using the program Warp. 49 A small subset of movies was collected with a phase plate, but only those collected without a phase plate were used for subtomogram averaging. The tilt images were aligned and reconstructed into tomograms using IMOD (version 4.12.8) with a rotational tilt-axis of −87°. 50 Results with the left-handed wrapping of two sub-fibrils are shown, but the actual handedness was not determined. Tomogram reconstruction utilized the default values from the Cryosample.adoc system template except for 2000 × 2000 patches being used for patch tracking and the use of 20 iterations of the SIRT-like filter to enhance the contrast of CMF for fiber annotation. Fiber annotation was performed in 3dmod using open contours placed on straight fibers, avoiding curved ones. Fiber widths were measured with the custom script "sideview-profile-average" written using Ortega. 51 Model points were added every 126 pixels on each contour using the addModPts command from PEET (version 1.15.0). 52,53 At bin = 1, this spacing affords non-overlapping subtomograms containing CMF spanning 26.4 nm for averaging.
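The acquisition geometry described above can be checked with a short calculation. The sketch below is illustrative only: it assumes a simple alternating dose-symmetric order (0°, +3°, −3°, ...), which may differ from the grouping used by the actual acquisition software, and it infers the bin-1 pixel size from the stated 126-pixel spacing that spans 26.4 nm. The function names are hypothetical.

```python
def dose_symmetric_tilts(max_tilt=60, step=3):
    """Tilt angles in a simple alternating dose-symmetric order.

    Assumption: strict +/- alternation; real schemes may group tilts.
    """
    order = [0]
    for angle in range(step, max_tilt + 1, step):
        order.extend([angle, -angle])
    return order


def pixel_size_nm(span_nm, span_pixels):
    """Pixel size implied by a known span in nm and in pixels."""
    return span_nm / span_pixels


tilts = dose_symmetric_tilts()
print(len(tilts))        # number of tilt images per series
print(tilts[:5])         # first angles in acquisition order

# 126 pixels span 26.4 nm at bin = 1 (values from the text)
px = pixel_size_nm(26.4, 126)
print(round(px, 4))      # nm per pixel
```

Under these assumptions each tilt series contains 41 images, and the implied pixel size is about 0.21 nm.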
MAS-DNP Sample Preparation and Experiments. For atomic-level characterization of the unlabeled CMF material, we employed a matrix-free protocol 54 to prepare the sample for DNP analysis. Briefly, the CMF material was mixed with a D2O/H2O mixture (3:1) and 10 mM of bi-nitroxide radical (AMUPol). 55 The sample was dried in a desiccator at room temperature for about 12 h to remove most D2O/H2O. Thereafter, 3 μL of D2O/H2O was added to provide partial moisture to the sample, which has been demonstrated previously to be the key factor in achieving satisfactory DNP enhancement (Figure 1e). 38,54,56−58 All spectra of the CMF sample were acquired on a 600 MHz (14.1 T) Bruker spectrometer with a 395 GHz gyrotron for microwave generation for DNP enhancement (Figure 1f). The microwave irradiation was 12 W. The sample was packed in a thin-walled 3.2 mm rotor, which was spun at 8 kHz MAS. The temperature at the stator was ∼100 K with microwave irradiation and decreased to 93 K when the microwave was off. The 13C chemical shifts are calibrated on the tetramethylsilane (TMS) scale.
For 1D 1H−13C cross-polarization magic-angle spinning (CP-MAS) experiments, Hartmann−Hahn conditions matched an average 13C field of 50 kHz (90 to 110% ramp) with a 1H field of 50 kHz during a 1 ms contact time. The DNP buildup time was measured to be 3 s; therefore, the recycle delay was set to 3.9 s for 1D experiments. 256 scans (17 min) and 32 scans (2 min) were collected for the 1D spectra under microwave-off and microwave-on conditions, respectively. Without applying any window function that would broaden the spectra during processing, the DNP spectrum displayed linewidths approximated at a maximum of 2.8 ppm for partially resolved cellulose peaks. Spectral deconvolution was performed on the 95 to 30 ppm region using DMFit. 59 The low chemical shift limit of the fit was chosen to show the baseline, while the higher cutoff was placed before the C1 signals. A minimum number of spectral components was chosen to fit the C4 region.
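The timing figures quoted for the 1D experiments can be reproduced with simple arithmetic. The sketch below assumes that the recycle delay was chosen as 1.3 times the DNP buildup time (a factor inferred here from the 3 s and 3.9 s values, not stated explicitly in the text) and that acquisition time is dominated by scans × recycle delay, ignoring pulse and acquisition overheads. The function names are hypothetical.

```python
def recycle_delay_s(buildup_s, factor=1.3):
    """Recycle delay as a multiple of the buildup time (factor is an assumption)."""
    return factor * buildup_s


def acquisition_minutes(scans, delay_s):
    """Approximate 1D experiment time, ignoring per-scan overheads."""
    return scans * delay_s / 60.0


delay = recycle_delay_s(3.0)
print(round(delay, 1))                            # recycle delay in seconds
print(round(acquisition_minutes(256, delay), 1))  # microwave-off, minutes
print(round(acquisition_minutes(32, delay), 1))   # microwave-on, minutes
```

With these assumptions the estimates come out at roughly 16.6 min and 2.1 min, in line with the 17 min and 2 min quoted above.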
Two types of 2D 13C−13C correlation spectra were measured on the unlabeled CMF: a 2D refocused INADEQUATE spectrum that reports single quantum (SQ)−double quantum (DQ) correlations 60−62 and a 2D CHHC spectrum that exhibits SQ−SQ correlations. 63 The recycle delays were between 3.0 and 3.9 s. For CP-based refocused J-INADEQUATE, a total of 608 scans were recorded in 44 h over three repetitive experiments, with 74−80 points in the indirect dimension. For the CHHC spectrum, a total of 336 scans were recorded in 18 h over three repetitive experiments, with 26 to 58 points in the indirect dimension. The CP contact times for the first H−C CP, the second C−H CP, and the third H−C CP were 1000, 500, and 500 μs, respectively. A 1H−1H mixing period of 2 ms was used. The spectra presented here are the summations of all spectra for each experiment.
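As a rough consistency check on the 2D measurement times, one can bound the total duration by scans × indirect points × recycle delay. The sketch below is an estimate only: it assumes the quoted scan totals are per-increment totals summed over the three repeats, ignores acquisition and pulse overheads, and uses the stated ranges for indirect points and recycle delays. The function names are hypothetical.

```python
def hours_2d(scans_per_increment, increments, recycle_delay_s):
    """Approximate duration of a 2D experiment in hours."""
    return scans_per_increment * increments * recycle_delay_s / 3600.0


def duration_bounds(scans, increment_range, delay_range_s):
    """Lower and upper duration bounds from the min/max increments and delays."""
    low = hours_2d(scans, increment_range[0], delay_range_s[0])
    high = hours_2d(scans, increment_range[1], delay_range_s[1])
    return low, high


inadequate = duration_bounds(608, (74, 80), (3.0, 3.9))
chhc = duration_bounds(336, (26, 58), (3.0, 3.9))
print([round(h, 1) for h in inadequate])  # bounds in hours, reported: 44 h
print([round(h, 1) for h in chhc])        # bounds in hours, reported: 18 h
```

The reported 44 h and 18 h both fall inside the bounds computed this way, which is all this sketch is meant to show.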
Solid-State NMR of Arabidopsis Cell Walls. 2D 13C−13C correlation solid-state NMR spectra were collected on uniformly 13C-labeled Arabidopsis samples for comparison with the DNP spectra collected on the unlabeled in vitro CMF. Isolation of the primary cell wall has been previously performed for intact and digested material. 64,65 1D 13C CP and 2D 30 ms proton-driven spin diffusion (PDSD) spectra were collected on both the digested and intact primary Arabidopsis cell walls on an 800 MHz NMR spectrometer under 13.5 kHz MAS frequency. The results were also compared with the spectra collected on a secondary Arabidopsis cell wall sample. 66
Most measurements of the repetition were within the range of 24−29 nm, with a mean periodicity of 26.7 ± 3.1 nm, as measured from the raw tomograms (Figure 2d). Fibers often ran parallel to one another, although isolated CMFs were regularly seen. Parallel alignment may be attributed to the forces experienced during blotting of the grids immediately prior to plunge-freezing.
Using the script sideview-profile-average to generate a 1D profile of density across the fiber illustrated in Figure 2c (green lines), the diameter of a single in vitro fiber was 7.0 nm (full-width, FW approach; Figure 2e). Since it is difficult to deal with birefringence due to imaging at defocus, Nicolas et al. 46 used the FWHM method to measure microfiber diameter in cryo-electron tomograms of onion peels. For the single in vitro fiber analyzed in Figure 2e, the FWHM was 4.8 nm. In this measurement, the length of the long green guideline spanned more than one periodic repeat, so the density profile represents an average along the length of the fiber and does not reveal any possible variation in density along the fiber. Shorter guidelines (cyan lines in Figure 2c) were then used to measure profile densities at different segments along 100 fiber locations, the averages of which are shown in Figures 2f and S2a.
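The FWHM measurement on a 1D density profile can be illustrated with a minimal sketch. This is not the sideview-profile-average script itself: it assumes a single-peaked, baseline-subtracted profile with positive peak values, and it locates the half-maximum crossings by linear interpolation. The function name `fwhm` and the synthetic Gaussian test profile are hypothetical.

```python
import math


def fwhm(xs, ys):
    """FWHM of a single-peaked, baseline-subtracted profile.

    Half-maximum crossings are found by linear interpolation between samples.
    """
    peak = max(range(len(ys)), key=lambda k: ys[k])
    half = ys[peak] / 2.0

    i = peak
    while i > 0 and ys[i] > half:       # walk left to the first sample <= half
        i -= 1
    j = peak
    while j < len(ys) - 1 and ys[j] > half:  # walk right likewise
        j += 1
    if ys[i] > half or ys[j] > half:
        raise ValueError("profile does not drop below half maximum")

    left = xs[i] + (half - ys[i]) * (xs[i + 1] - xs[i]) / (ys[i + 1] - ys[i])
    right = xs[j - 1] + (half - ys[j - 1]) * (xs[j] - xs[j - 1]) / (ys[j] - ys[j - 1])
    return right - left


# Synthetic Gaussian profile whose true FWHM is 4.8 nm (value from the text)
sigma = 4.8 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
xs = [0.1 * k - 10.0 for k in range(201)]            # positions in nm
ys = [math.exp(-x * x / (2.0 * sigma ** 2)) for x in xs]
print(round(fwhm(xs, ys), 1))
```

For real tomogram profiles, the result depends on how the baseline is defined, which is one reason FW and FWHM values for the same fiber differ as described above.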
When placing the guidelines randomly along fibers, a broad profile was obtained (cyan), indicating a fiber diameter of 5.6 nm (FWHM). This profile was seen to be a sum of two distinct profiles when the guidelines were placed at 50 of the darker regions that were noted in  (Figures 2f and S2c). The diameter of in vitro fibers thus varied periodically between 4.5 and 6.5 nm (FWHM), and the profile of the larger dimension could be modeled as two Gaussian peaks with FWHM diameters of 3.1 ± 0.1 and 2.7 ± 0.1 nm, respectively ( Figure S2d). The smaller dimensions are close to the ∼3.5 nm width reported for CMF of plant cell walls measured by other methods 67 and the larger dimension is nearly twice that size but very similar to the larger 6−10 nm subclasses of CMF observed by CET of cell walls present in Arabidopsis stems. 45 To better explore the structure of in vitro fibers, we performed subtomogram averaging. Given the measured periodicity, model points were placed every 26.5 nm along the long axis of filaments in 12 different tomograms, and then these points were used to extract subtomograms. From all model points, 4377 subtomograms (52 × 126 × 52 pixels) were obtained, aligned to an initial reference, and then averaged and re-aligned iteratively to obtain a 25 Å resolution (FSC = 0.5) density map (Figure 3a). This average captured one apparent periodic unit of an in vitro CMF, which contained a pair of parallel fibers wrapping around one another.
To further characterize the repeating nature of the CMF, a similar average was computed but with the particle size doubled along the long axis of the filament (52 × 256 × 52 pixels). Only every other model point was used so that the larger subtomograms did not overlap. This strategy gave half the particle count and a slightly reduced resolution (29.8 Å, FSC = 0.5), but it allowed us to view one complete 360° turn of the wrapping fibers (Figure 3c,d). At the center of this average, a crossover point between the fiber pair is visible.
While wrapping, the electron density between fibers dropped over part of the trajectory and then increased over the rest of the run (as shown in Figure 3a and quantified by sideview-profile-averaging in Figure 3b). The FWHM values for the two regions were 5.2 and 5.7 nm, respectively. Analysis of the bimodal profile of the subtomogram average yielded two Gaussian peaks with FWHM values of 2.5 ± 0.1 and 3.3 ± 0.1 nm (Figure S2e), which are very similar to the values described above for individual fibers in the tomograms (Gaussian peaks of 2.7 and 3.1 nm, Figure S2d). In the larger subtomogram average, the two sub-filaments appeared to cross over edge-to-edge in the high-density regions and face-to-face in the low-density regions (as illustrated in Figure 3e,f, Movie S1, and Figure S3). We propose that the two sub-filaments helically wrap but do not twist around the long axis of their trajectory.
To our knowledge, the first CET of vitrified lamellae of plant cell walls was recently submitted for publication. 46 The results of subtomogram averages of that study are directly comparable to those for the in vitro fibers being described here. FWHMs of the wrapped pair (5.2 and 5.7 nm at locations 1 and 2 in Figure 3b) fall near or within the 5.3 to 6.3 nm range of FWHMs reported for the onion CMF. 46 In Figure 3b, the sideview-profile-average of CMFs in untreated onion walls is overlain with those of the in vitro-synthesized fibers. While of a very similar size, the in vitro fibers are not identical to the onion CMFs, as seen by the monomodal versus bimodal profiles depending on where along the in vitro fiber one looks. This feature could have been overlooked in the onion study, or could reflect structural differences due to in vitro versus in vivo synthesis conditions and the different resolutions achieved in the subtomogram averages. Using the FWHM measurements and taking each sub-filament of the in vitro fiber to be of equal size, each sub-filament is about 2.9 nm in diameter and shaped as expected for a modeled 18-chain crystalline cellulose microfiber, which we fit into the density map using constrained rigid body methods (Figure 3e). Together, the wrapped pair presents an oval cross-sectional area like the larger oval-shaped CMFs reported for Arabidopsis cell walls. 45 Unfortunately, we have not obtained images of the in vitro CESA making glucan chains or of the chains coalescing into in vitro CMF. Likewise, the in vivo synthesis process has not been defined at such a level of detail. We thus do not know how faithfully the in vitro assembly process reflects the in vivo process. One potential difference is in the oligomerization state of the CESAs.
So far, preparations of detergent-solubilized functional CESAs have yielded mixtures of monomers, dimers, and trimers, the latter giving rise to cryo-EM structures of putative CSC lobes, 6,68 but higher-order assemblies like the hexamer of trimers seen in freeze-fracture TEM images of CSCs in plant cells have not been achieved in vitro. Also, higher-resolution structures are required to confirm or refute the possibility that the in vitro-synthesized fibers are crystalline cellulose. If they are, it is possible that the edge-to-edge fiber interactions seen here may contribute to the bundling of CMFs in plant cell walls. Below, we present DNP-assisted solid-state NMR data for the in vitro fibers that show them to be very similar to CMF in Arabidopsis cell walls but contain little bundling.
DNP-Enabled Solid-State NMR Characterization of In Vitro Fibers. High-resolution structural characterization of the in vitro-synthesized CMF is technically challenging due to the lack of isotope-labeling and the low quantity of materials available for analysis (17 mg wet mass). Therefore, DNP is required as an enabling technique to boost the NMR sensitivity by transferring polarization from the electrons to the nuclei. 69−74 This enables the use of the low natural abundance of 13C (1.1%) to measure multi-dimensional correlation spectra to probe the atomic-level structure of in vitro CMF. As summarized in Figure 1e,f, after in vitro synthesis by a microsomal fraction enriched for CESA proteins, 26 CMF is mixed with bi-radicals (AMUPol), the source of electrons for DNP, followed by DNP measurements on a 600 MHz/395 GHz instrument at cryogenic temperature. The DNP technique is not detrimental to the analysis of biological cellulosic materials. Also, the spectral resolution of these highly crystalline microfibrils is largely retained at cryogenic temperature and after biradical addition, as we have shown previously. 38,40,75 Here we achieved a 15-fold increase in signal-to-noise as denoted by the enhancement factor ε on/off, which represents the ratio of peak intensities with and without microwave irradiation (Figure 4a). The pattern of the 1D 13C CP DNP spectrum generally followed that of the room-temperature spectra of 13C-labeled cell walls of the model plant Arabidopsis (Figure 4b). Thus, the general spectral features of the interior and surface cellulose were resolved, notably in the C4 region with chemical shifts centered around 89 ppm for interior cellulose carbon-4 (i4) and 84 ppm for surface cellulose carbon-4 (s4). The other two domains of interest, according to their resolution, are the C1 peak at 105 ppm and the C6 signals at 65 and 62 ppm.
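The practical meaning of the enhancement factor can be sketched numerically. The example below assumes that peak intensities scale linearly with the number of scans, so that microwave-on and microwave-off spectra recorded with different scan counts are compared per scan, and it uses the general relation that signal-to-noise grows with the square root of the number of scans, so the time saving scales as ε². The intensity values are hypothetical placeholders, not measured data.

```python
def enhancement(intensity_on, scans_on, intensity_off, scans_off):
    """Enhancement factor from per-scan peak intensities (on vs off)."""
    return (intensity_on / scans_on) / (intensity_off / scans_off)


def time_saving(eps):
    """Factor by which measurement time shrinks for equal signal-to-noise."""
    return eps ** 2


# Hypothetical intensities chosen to give the 15-fold enhancement in the text;
# scan counts (32 on, 256 off) are those of the 1D experiments.
eps = enhancement(480.0, 32, 256.0, 256)
print(eps)               # enhancement factor
print(time_saving(eps))  # time-saving factor
```

Under this relation, a 15-fold enhancement corresponds to a roughly 225-fold reduction in measurement time, which is what makes natural-abundance 2D spectra feasible on a sample of this size.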
As these chemical shifts are indicators of torsional conformations (e.g., the χ torsion angle: O5−C5−C6−O6) 76 and hydrogen-bonding patterns, the resemblance of spectral patterns has revealed the structural similarity of the glucose residues in the in vitro CMF and the plant cell wall CMFs.
In addition, the DNP spectrum showed a carbonyl peak at 174 ppm and a methyl peak at 21 ppm, both assigned to the acetyl group of cellulose acetate. 77 This derivative may be a consequence of enzymatic acetylation by the isolated protein apparatus, which is a mixture of many membrane proteins contained in the detergent-solubilized microsomal fraction of protoplast membranes. Alternatively, it could be due to acetate formation in the short time lapse between DNP radical addition to the sample and its freezing. 78,79 However, such a feature has never been observed in previous DNP samples of plant cell wall materials; it remains unclear if the in vitro-synthesized CMF has higher reactivity.
Cellulose relies on its crystallinity to maintain narrow NMR linewidths; therefore, cellulose signals are only moderately broadened by the cryogenic temperature during DNP experiments. 75 In contrast, most non-cellulosic molecules, such as the matrix polysaccharides in plant cell walls, exhibit dramatically broadened signals at ∼100 K. For those dynamic molecules, a broad distribution of conformations will be entrapped (thus giving broad lines) when molecular motions are restricted under the DNP condition. The biradicals doped to the material preferentially partition into the solvent, using relayed 1H spin diffusion for hyperpolarization of molecules in the range of tens to hundreds of nanometers. 41 The line-broadening effect by paramagnetic relaxation enhancement thus becomes minimal as assessed in multiple studies. 38,40,75
While the number of glucan chains in cellulose has been under debate, mounting evidence from biochemical assays, imaging, modeling, and protein crystallography supports the concept that 18 chains should co-exist in an elementary microfibril. 6 Density functional theory (DFT) calculations also suggest that each elementary microfibril might contain six layers of glucans in a 2-3-4-4-3-2 arrangement (Figure 5a). 80 Solid-state NMR studies have recently revealed the torsional conformation of surface and interior chains (trans-gauche for interior chains and gauche-trans for surface chains) and have distinguished hydrophilic (sf) and hydrophobic (sg) surfaces. 76 Spectral deconvolution was conducted using DMFit 59 to analyze the composition of glucan chains, with a good agreement reached between the experimental and calculated spectra (Figure 5b; Tables S1 and S2). This fit was obtained while accounting for a major component at 89.2 ppm for interior cellulose but required two major peaks at 83.3 and 84.7 ppm for surface cellulose (Figure 5c).
The complexity in data fitting indicates that in vitro CMF has generally retained the structural heterogeneity of cellulose in plants.
Two weak components were also identified in the deconvoluted spectrum (Figure 5c). The 87.6 ppm signal has a similar chemical shift to the type-c cellulose recently identified in intact plant cell walls. 34−36 In plants, this special conformer belongs to some glucan chains that are deeply embedded in the core of a fibril, thus becoming spatially separated from surface chains. These chains cannot be accommodated by a small 18-chain microfibril; therefore, they might be created during the microfibril bundling process, which produces larger fibrils. The weak component of surface cellulose (81.5 ppm) was not well understood. A possible origin would be the presence of some more amorphous or less organized chains residing on the microfibril surface.
For better resolution, 2D correlation spectra were acquired on the unlabeled CMF, as enabled by the DNP technique. The refocused INADEQUATE spectra collected on unlabeled CMFs (Figure 5d) and 13C-labeled Arabidopsis cell walls (Figure 5e) were highly comparable. This cell wall sample also has signals from non-cellulosic molecules such as xylan. The 13C chemical shifts of all resolved carbon sites in CMF have been summarized in Table S3. The structural similarity is further supported by the overlay of the tilted version of the refocused INADEQUATE spectrum of in vitro CMF with the 2D 13C−13C correlation spectrum of Arabidopsis cell walls (Figure 5f). Moreover, conformer-specific information was obtained from the C4 region, where positions of type-f and type-g surface cellulose C4 can be distinguished (Figure 5d). Within a short measurement time of 44 h, the 2D spectrum of unlabeled CMF has provided excellent resolution and sensitivity as evidenced by the extracted cross-sections (Figure 5g). The representative 13C FWHM linewidth is 1.8 ppm, with reasonably strong signals that are far beyond the noise level.
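To relate the quoted 13C linewidths to frequency units, a ppm value can be converted to Hz using the 13C Larmor frequency of the spectrometer. The sketch below assumes a 13C/1H frequency ratio of about 0.25145 and uses the nominal 600 MHz 1H frequency of the DNP instrument; the constant and function names are hypothetical, and the result is an approximation.

```python
GAMMA_RATIO_13C_1H = 0.25145  # approximate 13C/1H Larmor frequency ratio


def carbon_frequency_mhz(proton_mhz):
    """13C Larmor frequency (MHz) for a given nominal 1H frequency (MHz)."""
    return proton_mhz * GAMMA_RATIO_13C_1H


def ppm_to_hz(ppm, proton_mhz):
    """Convert a 13C linewidth or shift difference from ppm to Hz."""
    return ppm * carbon_frequency_mhz(proton_mhz)


print(round(carbon_frequency_mhz(600), 1))  # 13C frequency in MHz
print(round(ppm_to_hz(1.8, 600)))           # 2D linewidth of 1.8 ppm, in Hz
print(round(ppm_to_hz(2.8, 600)))           # 1D maximum linewidth of 2.8 ppm, in Hz
```

On this basis the 1.8 ppm linewidth corresponds to roughly 270 Hz at a 13C frequency of about 151 MHz.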
As the sample preparation procedures involved the mixing of CMF with cellobiose, a disaccharide formed by two glucose units, we need to determine if cellobiose has contributed to the signals of 2D spectra. The expected signals of the two glucose units of cellobiose 81 deviate from the observed spectra, especially in the C1−C2 regions (Figure 5d). Cellulose has a high degree of polymerization through the C1−O−C4 covalent linkages. In contrast, the C1 of the C′ glucose residue in cellobiose is not covalently linked to other sugar units, resulting in a unique C1 13C chemical shift at 96 ppm. The signals of cellobiose have been broadened out by the broad distribution of conformations trapped at a low temperature; therefore, the DNP method selectively probes the highly crystalline component (CMF) in the sample. In addition, it is noteworthy that the expected chemical shifts of the model structures Iα and Iβ allomorphs 37 do not match the measured spectra (Figure S4). This observation has further confirmed our previous findings that the model crystallographic structures cannot exist in cellulose fibers with small crystallite dimensions. The cellulose in most plant cell walls, as well as the in vitro CMF, does not follow the model structures characterized by diffraction methods.
Spectral integration of resolved peaks in the 2D refocused INADEQUATE spectrum (Table S4) and deconvolution of the 1D 13C CP spectrum (Table S1) are simultaneously conducted to quantify two structural aspects of in vitro CMF: the interior-to-surface ratio and the ratio of hydrophilic (type-f) and hydrophobic (type-g) surfaces. These two ratios shed light on the structure of CMF. The percentages of different glucan chains estimated from 1D and 2D experimental data are generally consistent with the numerical values predicted by the initial 18-chain model of CMFs (Table 1). The results indicate that the in vitro CMFs are mainly present as individual microfibrils instead of larger bundles.
Table 1. Interior-to-surface ratios of CMF and the percentages of different surface conformers yielded from the theoretical model (from Figure 5a), peak volumes of 2D spectrum (from Figure 5d), and peak area of deconvoluted lines (from Figure 5c). Note that the i and s add up to 100% (all glucan chains are in a CMF). The sf and sg add up to 100% (all surface chain possible conformers). For 1D deconvolution, only resolved C4 and C6 signals are used. For 2D spectral analysis, all resolved resonances listed in Table S4 are used.
The NMR analysis has a considerable error margin that cannot be avoided. This is because of the limited resolution in 1D spectra and the non-quantitative nature of 2D NMR (notably from the differences in T2 relaxation time constants for many carbon sites). For analysis based on peak volumes from 2D spectra, we have averaged all the resolved resonances (detailed in Table S4) to reduce the uncertainty. For analysis based on the area of deconvoluted C4 peaks in the 1D spectrum, three different ways were used to understand the error margin (Table 1). First, the interior cellulose content was estimated to be 29% if only the three major C4 peaks (i4 at 89.2 ppm, sf4 at 84.7 ppm, and sg4 at 83.3 ppm) were used for the calculation. Second, including the contribution of the weak peak at 87.6 ppm increased the content of interior cellulose to 35%. This minor component probably correlates with the type-c cellulose in plant cell walls, which belongs to a special form of glucan chains deeply embedded in the center of a bundle of microfibrils. 34,36 The increase in the interior-to-surface ratio might reflect the structural effect of the bundling of microfibrils. Third, including the area of the 81.5 ppm peak (likely from some highly disordered surface chains) in the calculation brings the percentage of interior cellulose back down to 29%. While these estimations based on C4 peak intensities gave a relatively good match to the model, the analysis based on C6 peaks gave a poor correlation, which is likely caused by the limited resolution of C6 signals.
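For reference, the interior fraction expected from the layered 18-chain model can be counted directly. The sketch below uses a simplified rule that is an assumption of this example: all chains in the top and bottom layers, plus the two edge chains of every middle layer, are treated as surface chains, and the rest as interior. The function name is hypothetical, and the rule is only meant for compact cross-sections such as the 2-3-4-4-3-2 arrangement.

```python
def interior_surface_counts(layers):
    """Count (interior, surface) chains for a layered microfibril cross-section.

    Assumed rule: outermost layers are entirely surface; in each middle layer
    the two edge chains are surface and the remaining chains are interior.
    """
    total = sum(layers)
    interior = sum(max(n - 2, 0) for n in layers[1:-1])
    return interior, total - interior


interior, surface = interior_surface_counts([2, 3, 4, 4, 3, 2])
print(interior, surface)                                  # chain counts
print(round(100.0 * interior / (interior + surface), 1))  # interior percentage
```

With this counting, the 18-chain model has 6 interior and 12 surface chains, that is, about 33% interior cellulose, which lies between the 29% and 35% estimates obtained from the C4 deconvolution.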
Finally, we attempted to further probe the CMF structure by acquiring an SQ−SQ correlation spectrum (Figure 6a). At natural 13C abundance, most 2D SQ−SQ correlation methods are not practical: the spectra are dominated by the diagonal because it is highly improbable for a 13C to have a neighboring 13C with which to generate off-diagonal cross-peaks. However, the CHHC experiment chosen here sufficiently suppresses the diagonal owing to its 13C−1H−1H−13C transfer pathway. 82 This experimental scheme reports through-space correlations, so its sensitivity is substantially lower than that of the refocused INADEQUATE spectrum, which only shows through-bond correlations. In total, 18 h of measurement were needed to obtain a satisfactory signal-to-noise ratio for the CHHC spectrum. The CHHC spectrum reports 18 one-bond cross-peaks (e.g., s4−s5) and 28 multi-bond cross-peaks (e.g., i1−i6) (Table S5). All cross-peaks involving interior and surface cellulose are well-resolved. In addition, three interglucan cross-peaks were observed: one between carbon 6 of the interior chains and carbon 4 of the hydrophobic surface chains (i6−s g 4) and two between the carbon-6 sites of the interior and surface chains (s6−i6 and i6−s6).
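The scarcity of 13C−13C pairs at natural abundance can be made concrete with a short estimate. This sketch assumes a natural 13C abundance of about 1.1% and statistically independent isotope occupancy of each carbon site; the function names are introduced here for illustration only.

```python
# Approximate natural abundance of 13C (about 1.1%).
NAT_ABUNDANCE_13C = 0.011


def neighbor_probability(abundance):
    """Chance that one specific neighbor of a given 13C is also 13C."""
    return abundance


def pair_probability(abundance):
    """Chance that two specific carbon sites are both 13C,
    assuming independent isotope occupancy of each site."""
    return abundance ** 2


p_neighbor = neighbor_probability(NAT_ABUNDANCE_13C)
p_pair = pair_probability(NAT_ABUNDANCE_13C)

print(f"13C neighbor given a 13C: {100 * p_neighbor:.1f}%")
print(f"both sites 13C: {100 * p_pair:.4f}% (about 1 in {round(1 / p_pair)})")
```

Under these assumptions only about one in a hundred 13C nuclei has a given neighbor that is also 13C, which is why uncorrelated diagonal signal overwhelms the cross-peaks unless the experiment suppresses it.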
A few spectral regions of CMF were compared with the 2D 13C−13C correlation spectra of digested primary cell walls of 13C-labeled Arabidopsis (Figure 6b). The Arabidopsis spectrum is presented with two types of window function processing: one with a squared sine bell (QSINE) window function that enhances resolution, and one with 100 Hz exponential (EM) broadening that partially enhances the signal-to-noise ratio and mimics the CMF spectra. The Arabidopsis spectra showed the signals of different interior cellulose conformers, mainly type-c and type-a/b, which correspond, respectively, to the deeply embedded core chains and to the intermediate layers sandwiched between the core and surface chains, as resolved in our previous studies.
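The two window functions weight the time-domain signal differently before Fourier transformation. The sketch below shows commonly used forms of both, as an illustration rather than the exact processing applied here. Only the 100 Hz EM broadening comes from the text; the sine-bell shift parameter `ssb`, the acquisition time `aq`, and the number of points are hypothetical values chosen for the example.

```python
import math


def em_window(t, lb_hz):
    """Exponential multiplication: adds lb_hz of Lorentzian line
    broadening (full width at half maximum) after Fourier transformation."""
    return math.exp(-math.pi * lb_hz * t)


def qsine_window(t, aq, ssb=2.0):
    """Squared sine bell over the acquisition time aq.
    ssb shifts the bell: ssb = 2 gives a squared cosine starting at 1;
    ssb < 2 is treated as an unshifted bell starting at 0 (assumed convention)."""
    phi = math.pi / ssb if ssb >= 2 else 0.0
    return math.sin((math.pi - phi) * (t / aq) + phi) ** 2


# Hypothetical acquisition parameters, for illustration only.
aq = 0.010          # acquisition time in seconds
n_points = 6
times = [aq * k / (n_points - 1) for k in range(n_points)]

em = [em_window(t, lb_hz=100.0) for t in times]
qsine = [qsine_window(t, aq, ssb=2.0) for t in times]

for t, e, q in zip(times, em, qsine):
    print(f"t = {1e3 * t:4.1f} ms  EM = {e:.3f}  QSINE = {q:.3f}")
```

The exponential window damps the noisy tail of the signal at the cost of broader lines, whereas the squared sine bell forces the signal smoothly to zero at the end of the acquisition, which limits truncation artifacts while retaining narrower lines.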
With the current resolution, type-f and type-g surface conformers are easily differentiated for in vitro CMF, but this is not the case for interior cellulose conformers. Only one very weak peak shoulder could be partially observed in the C2,3,5−C6 region of the CMF CHHC spectrum. According to the Arabidopsis data, this shoulder corresponds to a minor contribution from C6 of the type-c interior cellulose conformer, which is present when the average cellulose structure exceeds 18 chains, for example, through the association of multiple microfibrils. Its presence also explains why a minor additional component is necessary for 1D deconvolution (thick gray dashed line at 87.6 ppm in Figure 5c). As cellulose is rich in hydroxyl groups, chain bundling mediated by electrostatic interactions could be expected. The low intensity indicates that only a very low degree of bundling has occurred between different CMFs, either between very few fully parallel chains or within limited regions dispersed along the fibrils. The latter possibility agrees with the tomography observation of the wrapping arrangement of CMF, which leads to mostly individualized fibers and localized areas with higher cellulose densities (Movie S1 and Figure S3). As interfibrillar association and sliding in bundled cellulose networks regulate cell wall mechanics, 83 understanding such interactions could guide the development of in vitro biomaterials with tunable properties.

■ CONCLUSIONS
This study has presented CET subtomogram averaging of in vitro-synthesized CMFs and a MAS-DNP solid-state NMR method for characterizing their atomic-level structure without isotope-labeling and with a very limited quantity of material. DNP sensitivity enhancement has enabled the measurement of high-resolution 2D 13C−13C correlation spectra to resolve different glucan chains and quantify their populations in in vitro CMFs. Although synthesized in vitro, these CMFs have effectively retained the native structure of CMFs in plant cell walls. Quantification of peak intensities is in good agreement with the 18-chain cellulose model. Fibrillar bundling occurs only at a minimal level in in vitro CMFs, but there is an edge-to-edge interaction that might contribute to bundling. The methods are widely applicable to the structural elucidation of many other carbohydrate-based biomacromolecules, such as functionalized cellulose- and lignocellulose-based fibers, as well as in vitro-synthesized cell walls and biomaterials.
Comparison of three subtomogram averages; surface rendering of an 18-chain model of crystalline cellulose placed above the average for non-overlapping subtomograms of length 26.5 nm (left), 53.8 nm (middle), and 53.8 nm but with 50% overlap (right); scale bar is one average repeat length (26.7 nm) and below it is listed the density threshold for the current view; TEM of in vitro-synthesized fibrils; side-view profile averages of in vitro fibers and subtomogram average; snapshots of Movie S1; overlay of CMF spectra with expected signals of Iα and Iβ cellulose; spectral deconvolution parameters; 13